ABSTRACT

Let $\{X_{ni},\ i \le k(n),\ n \ge 1\}$ be a triangular array of row-wise independent random variables. If $S(X_{n1},\ldots,X_{nj})$ is a statistic based on $X_{n1},\ldots,X_{nj}$, a cumulative process is defined by $S_n(t) = S(X_{n1},\ldots,X_{n,k(n)t})$, $0 \le t \le 1$. The asymptotic behavior of $S_n$ is determined for $S$ a percentile and for $S$ a smoothly weighted linear combination of order statistics.

SIGNIFICANCE AND EXPLANATION

Percentiles and linear combinations of order statistics are statistics which are sometimes preferred to averages because they can be less sensitive to the presence of a few wild observations. It is well known that for large samples, both percentiles and linear combinations of order statistics resemble averages in that the appropriately normalized statistic is approximately normally distributed, with parameters which depend on the underlying distributions.

This paper shows that percentiles and linear combinations of order statistics resemble averages in a stronger sense. It is well known that if a sequence of averages is plotted against the number of observations contributing to the average, and the resulting plot is rescaled appropriately, then for long sequences the picture will act like a realization of a Brownian motion path. This paper establishes that this is still true if the averages are replaced by percentiles or by linear combinations of order statistics.

Although this resemblance may appear to be an abstract probabilistic theorem, these results can be applied to estimation of monotone functions.

CUMULATIVE PROCESSES: LINEAR COMBINATIONS
OF ORDER STATISTICS AND PERCENTILES

Sue Leurgans

1.  Introduction  and  Notation. 

Let $\{X_{ni},\ 1 \le i \le k(n),\ n \ge 1\}$ be a triangular array of row-wise independent random variables. Let $S(y_1,\ldots,y_k)$ denote a statistic based on $y_1,\ldots,y_k$. Define the cumulative processes $S_n$ on $[0,1]$ by $S_n(t) = S(X_{n1},\ldots,X_{n,k(n)t})$. For example, if $S(y_1,\ldots,y_k) = y_1 + \cdots + y_k$, then $S_n$ is the familiar cumulative sum process. It is well known that if $\{\mu_{ni},\ 1 \le i \le k(n),\ n \ge 1\}$ is the triangular array of expected values of the $X_{ni}$'s and $M_n$ is the deterministic process defined by $M_n(t) = S(\mu_{n1},\ldots,\mu_{n,k(n)t})$, then $k(n)^{-1/2}(S_n - M_n)$ converges weakly to a Gaussian process which is determined by the variances of the $X_{ni}$'s. (This conclusion assumes only the existence of the first two moments and the validity of the Lindeberg condition.)

Since, in general, $S_n(t)$ is a statistic for $t$ fixed, the study of the limiting behavior of the process is the study of the limiting behavior of a sequence of statistics. It is natural to suppose that (under appropriate conditions) cumulative percentile processes and cumulative linear function of order statistic processes converge weakly to Gaussian processes. This paper focusses on these two processes under conditions which arise in monotone estimation.

Neither of these processes has certain convenient properties of the cumulative sum processes. The most convenient of these properties is that summing is a linear operation. If $S$ is a linear function of order statistics or a percentile, then in general $S(y + z) \ne S(y) + S(z)$ for arbitrary $y$ and $z$. This causes the extension from the i.i.d. case to the independent, nonidentically distributed case to be more complicated for cumulative percentiles and cumulative linear functions of order statistics than for cumulative sums. This paper shows that under suitable conditions, these convenient properties are almost present in the sense that cumulative linear function of order statistic processes and cumulative percentile processes in $C[0,1]$ are asymptotically equivalent to sums of more tractable processes. For smoothly-weighted linear functions of order statistics, these processes are a non-random element of $C[0,1]$ and a cumulative sum process. For percentile processes, a non-random function and an empirical distribution function are used. The empirical distribution function is the cumulative sum obtained by formal substitution of delta-functions in the results for linear functions of order statistics. Unfortunately, this formal substitution does not meet the conditions of Section 2. Section 3 therefore consists of an independent proof for percentile processes. Section 4 compares the conditions imposed here with assumptions other authors have imposed in related problems.
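
The loss of linearity for percentiles can be seen on a toy input. The following is a minimal sketch, not part of the original argument; the vectors `y` and `z` are arbitrary illustrative values, and the median stands in for a general percentile.

```python
import statistics

# Two arbitrary illustrative data vectors (hypothetical values).
y = [0.0, 1.0, 10.0]
z = [5.0, 0.0, 1.0]
y_plus_z = [a + b for a, b in zip(y, z)]

# Summation is linear: S(y + z) equals S(y) + S(z).
sum_is_linear = sum(y_plus_z) == sum(y) + sum(z)

# The median (the 50th percentile) is not linear in general.
median_of_sum = statistics.median(y_plus_z)
sum_of_medians = statistics.median(y) + statistics.median(z)
median_is_linear = median_of_sum == sum_of_medians
```

Here the median of the coordinatewise sum differs from the sum of the medians, while the plain sum passes the same check.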

Unless otherwise stated, $\{X_{ni},\ 1 \le i \le k(n),\ n \ge 1\}$ will be a triangular array of random variables such that $\{X_{ni},\ 1 \le i \le k(n)\}$ is (for every $n$) a set of independent random variables. $F_{ni}$ will be the cumulative distribution function (CDF) of $X_{ni}$. $F_n$ is defined by $F_n(x) = \sum_{i=1}^{k(n)} F_{ni}(x)/k(n)$. The average CDF $F_n$ can be thought of as the CDF of a "randomly selected" member of $\{X_{ni},\ 1 \le i \le k(n)\}$. This notation will be extended to $m \le k(n)$ by defining $F_{n,m}(x) = \sum_{i=1}^{m} F_{ni}(x)/m$. For any set of $k$ numbers, $\{x_1,\ldots,x_k\}_{(m)}$ will denote the $m$th order statistic of the set of numbers.
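
As a concrete reading of this notation, the sketch below builds the average CDF of the first $m$ members of a row and picks out an order statistic. It is only an illustration: the helper names (`normal_cdf`, `average_cdf`, `order_statistic`) and the choice of normal row distributions are assumptions made here, not part of the paper.

```python
import math

def normal_cdf(mean, sd=1.0):
    """CDF of a normal distribution, used here as a hypothetical F_ni."""
    return lambda x: 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def average_cdf(row_cdfs, m=None):
    """F_{n,m}(x) = sum_{i<=m} F_ni(x) / m; m defaults to the whole row."""
    m = len(row_cdfs) if m is None else m
    head = row_cdfs[:m]
    return lambda x: sum(F(x) for F in head) / m

def order_statistic(values, m):
    """The m-th smallest element (1-indexed) of a finite set of numbers."""
    return sorted(values)[m - 1]

# One row of the array with three non-identical distributions.
row = [normal_cdf(-1.0), normal_cdf(0.0), normal_cdf(1.0)]
F_n = average_cdf(row)          # average CDF of the full row
F_n2 = average_cdf(row, m=2)    # average CDF of the first two members
```

For this symmetric row, `F_n(0.0)` is 1/2 up to rounding, and `order_statistic([3, 1, 2], 2)` returns 2.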

$W$ will denote a standard Wiener process, usually in $C[0,1]$. Most of the processes constructed in this paper are to be thought of as members of $C[0,1]$, defined by linear interpolation between a finite set of points. This construction will be left implicit.

Integrals  without  limits  are  integrals  over  [0,1]. 

Additional  notation  will  be  defined  as  needed.  When  quantity  A  is  being  defined 
and  set  equal  to  the  expression  B,  this  will  be  written  either  A  :=  B  or  B  =:  A. 
2. Smoothly Weighted Linear Combinations of Order Statistics.

Let $X_{n,m;j}$ denote $\{X_{n1},\ldots,X_{nm}\}_{(j)}$. Define $k(n)$ weighted sums of order statistics with weight function $J$ by

$$S_{nm} = \sum_{j=1}^{m} J\!\left(\frac{j}{m+1}\right) X_{n,m;j}, \qquad 1 \le m \le k(n).$$

Set $S_{n0} = 0$ and define a random function in $C[0,1]$ by $S_n(m/k(n)) = (k(n))^{-1/2} S_{nm}$, $0 \le m \le k(n)$.
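
The definition of $S_{nm}$ and of the interpolation points of $S_n$ can be sketched directly. The code below is an illustration under stated assumptions: `bump_weight` is a hypothetical weight function chosen to be smooth and to vanish near 0 and 1, and the function names are not from the paper.

```python
import math

def bump_weight(u, lo=0.1, hi=0.9):
    """Hypothetical weight J: smooth on [0, 1], zero outside (lo, hi)."""
    if u <= lo or u >= hi:
        return 0.0
    s = (u - lo) / (hi - lo)
    return math.sin(math.pi * s) ** 3

def weighted_order_sum(xs, J):
    """S_nm = sum over j of J(j / (m + 1)) times the j-th order statistic."""
    m = len(xs)
    return sum(J(j / (m + 1)) * x for j, x in enumerate(sorted(xs), start=1))

def cumulative_process(row, J):
    """Values S_n(m / k) = k^(-1/2) * S_nm for m = 0, ..., k (S_n0 = 0)."""
    k = len(row)
    return [weighted_order_sum(row[:m], J) / math.sqrt(k) for m in range(k + 1)]

# With constant weight 1 the process reduces to a scaled cumulative sum.
path = cumulative_process([1.0, 2.0, 3.0, 4.0], lambda u: 1.0)
```

With the constant weight the result is the cumulative sum divided by $\sqrt{k}$, which gives a quick sanity check; passing `bump_weight` instead produces a trimmed, smoothly weighted version.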

The  first  part  of  the  theorem  of  this  section  will  be  proved  under  the  five 
regularity  assumptions,  A1-A5. 

A1: $J$ is a real-valued function on the unit interval whose derivative $J'$ exists everywhere on $[0,1]$ and satisfies a Hölder condition for $1/2 < \gamma \le 1$, that is, there is a constant $K_L$ such that $|J'(u) - J'(v)| \le K_L |u - v|^{\gamma}$.

A2: The support of $J$ is a ...

A3: $\lim_{n\to\infty} (k(n))^{1/4} \max_{1 \le j \le k(n)} \sup$ ...

A4: There exists an open set ... CDF $F$ such that $\lim_{n\to\infty} F_n(x)$ = ...

A5: $\{F_n,\ n \ge 1\}$ is a tight ...

The  second  part  of  Theorem  2.1  imposes  two  more  conditions:  one  a  rate  condition  and 
the  other  limiting  discontinuities. 

A6: There exists an open set $U$ containing the support of $J$ such that $\lim_{n\to\infty} \Delta_n (k(n))^{1/4} < \infty$, where $\Delta_n = \sup_{u \in U} |F_n^{-1}(u) - F^{-1}(u)|$.

A7: $F$ and $F_n$ are strictly increasing on the support of $J$.

Note that conditions A3, A4, A5, A6 and A7 are trivial if all of the $F_{nj}$ are the same continuous strictly increasing distribution function $F$.

The first two assumptions involve only the weight function $J$, and not the random variables. A1 is a smoothness condition for $J'$, which forces $J$ to be continuously differentiable. If $\gamma = 1$, the Hölder condition is a first-order Lipschitz condition. Since $J$ is a bounded function on a bounded interval it is no loss of generality to require $\gamma \le 1$. The existence of $J'$ everywhere ensures that the Mean Value Theorem can be applied straightforwardly. A2 implies that the weighted sum of order statistics trims and guarantees the existence of certain integrals. The trimming implies that the arithmetic mean does not meet these assumptions. The smoothness conditions are not met by percentiles or by trimmed means. The next two assumptions involve the random variables only, and not the weight functions. A3 asserts that, as $n$ gets large, the distributions of the random variables within the $n$th row of the triangular array approach each other quickly enough that the nonidentical nature of the distributions within each row does not disturb the asymptotic behavior. A4 asserts that the mean distribution functions approach a limiting distribution function, except possibly in the tails. The limiting distribution determines the limiting variance. The tail condition A5 is used to show that the weights $J(j/(m+1))$ can be replaced by $J(j/m)$. A5 does not require that the $F_n$ converge in the tails, only that mass does not escape to infinity. The first part of Theorem 2.1 does not require any assumptions about the rate of convergence in A4. A6 is just such a condition. A7, which does not involve the tails of the mean CDF, is used to ignore ties.

Theorem 2.1. Under A1-A5, if $\{X_{n1},\ldots,X_{n,k(n)}\}$ are mutually independent for every $n$, then

$$\frac{1}{\sigma}\,[S_n(\cdot) - D_n(\cdot) - C_n(\cdot)] \to W,$$

where

$$\sigma^2 = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} J(F(x))\,J(F(y))\,F(\min(x,y))\,(1 - F(\max(x,y)))\,dx\,dy$$

and $D_n$ and $C_n$ are defined by (1) and (2) respectively,

$$D_n\!\left(\frac{m}{k(n)}\right) = \left(\frac{m}{k(n)}\right)^{1/2}\left[\int R_n(u)\,U_{nm}(u)\,J'(V_{nm}(u))\,d\Gamma_{nm}(u) + \int R_n(u)\,J(u)\,dU_{nm}(u)\right], \tag{1}$$

$$C_n\!\left(\frac{m}{k(n)}\right) = \frac{m}{\sqrt{k(n)}}\left[\int R_n(u)\,J(u)\,du + \int G(u)\,J(u)\,du + \int J(u)\left(u - \sum_{j=1}^{m}\frac{F^*_{nj}(u)}{m}\right)dG(u)\right]. \tag{2}$$

$G$, $R_n$, $F^*_{nj}$, $\Gamma_{nm}$, $U_{nm}$ and $V_{nm}$ are defined below. If A6 and A7 also hold, then $D_n \xrightarrow{P} 0$.

Sketch of proof of theorem:

The proof is based on rewriting the process $S_n$ as the sum of ten explicit processes. Five of these processes will be grouped to form $D_n$ and $C_n$. Four of the processes are shown to converge weakly to zero. The remaining process is shown to have a non-degenerate proper weak limit.

Since this proof is necessarily very intricate, most of the details are suppressed. The proof consists of five lemmas.

Throughout this section, $G$ will be used for inverse distribution functions (assumed left-continuous, if there is any ambiguity): $G(u) = F^{-1}(u)$ and $G_n(u) = F_n^{-1}(u)$. Set $R_n(u) = G_n(u) - G(u)$. Thus A6 assumes that $(k(n))^{1/4} \sup_{u \in U} |R_n(u)|$ is bounded.

Transform the $X_{ni}$'s into $(0,1]$ by $Y_{ni} = F_n(X_{ni})$. Denoting the corresponding distribution functions $F^*_{ni} = F_{ni} \circ G_n$, it follows that $X_{ni} = G_n(Y_{ni})$ with probability one and that $Y_{n,m;j} = F_n(X_{n,m;j})$. Let $\Gamma_{nm}$ be the empirical distribution function of $\{Y_{n1},\ldots,Y_{nm}\}$. In the course of this proof, the empirical distribution function $\Gamma_{nm}$ will be compared with the uniform distribution function, and the difference will be rescaled to have a non-degenerate limiting distribution. Therefore set $U_{nm}(u) = \sqrt{m}\,(\Gamma_{nm}(u) - u)$. In this notation, off a null set,

$$S_{nm} = m \int J\!\left(\frac{m}{m+1}\,\Gamma_{nm}(u)\right) G_n(u)\,d\Gamma_{nm}(u).$$

If $G_n$ is continuous, the null set is empty. Since the failure of the equality on a null set will not affect the weak convergence, this possibility will not be dealt with explicitly in the sequel.

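
The left-continuous inverse and the rescaled empirical process can be illustrated on a finite sample. This is a minimal sketch, assuming the inverse is taken as $\inf\{x : F(x) \ge u\}$ and applied to an empirical CDF; the function names are hypothetical.

```python
import math

def empirical_cdf(sample):
    """Right-continuous empirical CDF: fraction of sample points <= u."""
    xs = sorted(sample)
    m = len(xs)
    return lambda u: sum(1 for x in xs if x <= u) / m

def left_continuous_inverse(sample):
    """G(u) = inf{x : F(x) >= u} for the empirical CDF F of `sample`."""
    xs = sorted(sample)
    m = len(xs)

    def G(u):
        if not 0.0 < u <= 1.0:
            raise ValueError("u must lie in (0, 1]")
        # F first reaches level u at the ceil(u * m)-th order statistic.
        return xs[math.ceil(u * m) - 1]

    return G

def rescaled_empirical(sample):
    """U(u) = sqrt(m) * (Gamma(u) - u), with Gamma the empirical CDF."""
    Gamma = empirical_cdf(sample)
    root_m = math.sqrt(len(sample))
    return lambda u: root_m * (Gamma(u) - u)

ys = [0.2, 0.9, 0.5, 0.7]          # hypothetical values in (0, 1]
G = left_continuous_inverse(ys)
U = rescaled_empirical(ys)
```

On each interval $((j-1)/m,\ j/m]$ the inverse is constant and equal to the $j$th order statistic, which is what makes it left-continuous.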
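Define the ten processes below by linear interpolation between the following points:

$$M_{jn}(0) = 0, \qquad 0 \le j \le 9,$$

$$M_{0n}\!\left(\frac{m}{k(n)}\right) = \frac{m}{\sqrt{k(n)}} \int \left[J\!\left(\frac{m}{m+1}\,\Gamma_{nm}(u)\right) - J(\Gamma_{nm}(u))\right] G_n(u)\,d\Gamma_{nm}(u),$$

$$M_{1n}\!\left(\frac{m}{k(n)}\right) = \left(\frac{m}{k(n)}\right)^{1/2} \int U_{nm}(u)\,[J'(V_{nm}(u)) - J'(u)]\,G(u)\,d\Gamma_{nm}(u).$$

(See Lemma 2.1 for $V_{nm}$.)

$$M_{2n}\!\left(\frac{m}{k(n)}\right) = -\frac{1}{2\sqrt{k(n)}} \int U_{nm}^2(u)\,d[G(u)J'(u)], \tag{3}$$

$$M_{3n}\!\left(\frac{m}{k(n)}\right) = \frac{1}{2m\sqrt{k(n)}} \sum_{j=1}^{m} G(Y_{nj})\,J'(Y_{nj}),$$

$$M_{4n}\!\left(\frac{m}{k(n)}\right) = \left(\frac{m}{k(n)}\right)^{1/2} \int R_n(u)\,U_{nm}(u)\,J'(V_{nm}(u))\,d\Gamma_{nm}(u),$$

$$M_{5n}\!\left(\frac{m}{k(n)}\right) = \left(\frac{m}{k(n)}\right)^{1/2} \int R_n(u)\,J(u)\,dU_{nm}(u), \tag{4}$$

$$M_{6n}\!\left(\frac{m}{k(n)}\right) = \frac{m}{\sqrt{k(n)}} \int R_n(u)\,J(u)\,du, \tag{5}$$

$$M_{7n}\!\left(\frac{m}{k(n)}\right) = \frac{m}{\sqrt{k(n)}} \int J(u)\left[u - \sum_{j=1}^{m}\frac{F^*_{nj}(u)}{m}\right] dG(u),$$

$$M_{8n}\!\left(\frac{m}{k(n)}\right) = \frac{m}{\sqrt{k(n)}} \int G(u)\,J(u)\,du, \tag{6}$$

$$M_{9n}\!\left(\frac{m}{k(n)}\right) = \sum_{j=1}^{m} \frac{Z_{nj}}{\sqrt{k(n)}}, \quad \text{where} \quad Z_{nj} = \int J(u)\,[F^*_{nj}(u) - I\{Y_{nj} \le u\}]\,dG(u).$$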

Note that all the component processes are piecewise linear with corners at the same coordinates. $M_{0n}$ arises when the weights $J(j/(m+1))$ are replaced by $J(j/m)$. $M_{1n}$, $M_{2n}$ and $M_{3n}$ are the processes defined in Guiahi (1975). The assumptions A1 and A2 imply that the total variation of $GJ'$ is finite and therefore that the integral in (3) is finite. The integrals in (5) to (6) exist and are finite by A1 and A2. $D_n$ and $C_n$ are formed from $M_{4n}$ through $M_{8n}$.




The proof of this lemma is similar to that of Lemma 3.5 below. Once the same construction has established the connection with a sequence of iid uniforms, the law of the iterated logarithm for Kolmogorov-Smirnov distances (see Csaki (1968)) and Lemma 3.2 are used to complete the proof.

Lemma 2.3.

$$\max_{0 \le m \le k(n)} |M_{jn}(m/k(n))| \xrightarrow{P} 0, \qquad 0 \le j \le 3. \tag{7}$$

Proof: The convergence (7) will be established for each $j$ in turn. In order to simplify the notation, we describe the proof of this lemma for $j = 0$ in the case in which the support of $J$ is connected. In this case, A2 and A4 imply that $a_1$, $a_2$, $b_1$ and $b_2$ can be chosen such that $0 < a_1 < b_1 < b_2 < a_2 < 1$, $(a_1,a_2)$ is an open set satisfying A4 and $[b_1,b_2]$ is the support of $J$. The trimming assumption and the Lipschitz condition for $J$ (implied by A1) can be used to show that

$$\left|J\!\left(\frac{m}{m+1}\,\Gamma_{nm}(u)\right) - J(\Gamma_{nm}(u))\right| \le \frac{\Gamma_{nm}(u)}{m+1}\,K_L\left[I\{u \in U\} + I\{\Gamma_{nm}(a_1) > a_1 + \varepsilon\} + I\{\Gamma_{nm}(a_2) < a_2 - \varepsilon\}\right].$$

The strong law of large numbers for Bernoulli random variables ensures the existence of a fixed $m_0(\varepsilon)$ such that for $n$ sufficiently large,

$$P\left\{\left(I\{\Gamma_{nm}(a_1) > a_1 + \varepsilon\} + I\{\Gamma_{nm}(a_2) < a_2 - \varepsilon\}\right)K_L \ne 0 \text{ for some } m,\ m_0(\varepsilon) \le m \le k(n)\right\} < \varepsilon.$$

It follows that with probability at least $1 - \varepsilon$,

$$|M_{0n}(t)| \le \frac{K'}{\sqrt{k(n)}} \max_{1 \le m \le m_0(\varepsilon)} \left[\int |G_n(u)|\,d\Gamma_{nm}(u)\right] + \frac{K_L}{\sqrt{k(n)}} \sup_{a_1 \le u \le a_2} |G_n(u)|.$$

Because $m_0(\varepsilon)$ is fixed and A3 and A5 imply that the CDF's $\{F_{nj},\ 1 \le j \le k(n),\ n \ge 1\}$ are tight, the maximum of $\{|X_{n1}|,\ldots,|X_{n,m_0(\varepsilon)}|\}$ is uniformly stochastically bounded in $n$ and the first term above converges in probability to zero. A4 implies that $\sup\{|G_n(u)| : a_1 \le u \le a_2\}$ is bounded uniformly in $n$. Therefore with probability $1 - \varepsilon$, $|M_{0n}|$ is bounded by a random quantity which converges to zero in probability. Since $\varepsilon$ can be chosen arbitrarily small, $M_{0n}$ converges to zero weakly in $C[0,1]$.

By the Lipschitz Condition A1 and by the construction of $V_{nm}$,

$$|J'(V_{nm}(u)) - J'(u)| \le K_L\,|\Gamma_{nm}(u) - u|^{\gamma}, \qquad 0 \le u \le 1,$$

where $K_L$ is the Lipschitz constant. Substitution of this bound in $M_{1n}$ gives

$$\left|M_{1n}\!\left(\frac{m}{k(n)}\right)\right| \le \frac{K_L\,m^{(1-\gamma)/2}}{(k(n))^{1/2}} \int_{b_1}^{b_2} |U_{nm}(u)|\,\left|m^{1/2}(\Gamma_{nm}(u) - u)\right|^{\gamma} |G(u)|\,d\Gamma_{nm}(u) \le \frac{K_L\,(k(n))^{(1-\gamma)/2}}{(k(n))^{1/2}} \sup_{0 \le u \le 1} |U_{nm}(u)|^{1+\gamma} \sup_{b_1 \le u \le b_2} |G(u)|.$$

The conclusion (7) for $j = 1$ follows immediately from Lemma 2.2, the continuous mapping theorem, and the fact that A1 forces $\gamma - 1$ to be nonpositive.

Similarly, the definition of $M_{2n}$ (3) implies (8):

$$\left|M_{2n}\!\left(\frac{m}{k(n)}\right)\right| \le \frac{1}{2\sqrt{k(n)}} \max_{0 \le m \le k(n)} \sup_{0 \le u \le 1} U_{nm}^2(u)\,V(GJ'), \tag{8}$$

where $V$ denotes total variation. Since A2 implies $V(GJ')$ is finite, Lemma 2.2 and the continuous mapping theorem imply (7) for $j = 2$. By definition of $M_{3n}$,

$$\max_{0 \le m \le k(n)} \left|M_{3n}\!\left(\frac{m}{k(n)}\right)\right| \le \frac{1}{2\sqrt{k(n)}} \max_{0 < m \le k(n)} \sum_{i=1}^{m} \frac{|G(Y_{ni})\,J'(Y_{ni})|}{m}\,I\{b_1 \le Y_{ni} \le b_2\}.$$

A1 implies that $J'$ is a continuous function on a compact interval, hence a bounded function. A2 implies $G$ is bounded on $[b_1,b_2]$. Therefore each mean inside the maximum, and the maximum itself, is bounded. Since $k(n)$ converges to infinity, (7) holds for $j = 3$. ■

Lemma 2.4.

$$\frac{1}{\sigma}\,M_{9n} \to W \quad \text{as } n \to \infty.$$

Proof of Lemma 2.4:

Recall that $M_{9n}$ is the normalized cumulative sum of $\{Z_{ni},\ 1 \le i \le k(n)\}$. By Prohorov's generalization of Donsker's theorem (see Billingsley (1968) (p. 77, pr. 10.1)), it suffices to show that the $Z_{nj}$'s satisfy Lindeberg's Condition and that the random functions $\tau_n$ converge weakly to the identity function on $[0,1]$, where $\tau_n(t) := \mathrm{Var}\,M_{9n}(t)/\mathrm{Var}\,M_{9n}(1)$. Since $Z_{nj}$ depends only on $X_{nj}$, $\{Z_{n1},\ldots,Z_{n,k(n)}\}$ are mutually independent random variables. A2 implies that expectation and integration can be interchanged to give the following formulae:

$$E Z_{nj} = \int J(u)\,[F^*_{nj}(u) - E\,I\{Y_{nj} \le u\}]\,dG(u) = 0,$$

$$\sigma_{nj}^2 := \mathrm{Var}\,Z_{nj} = \iint J(u)\,J(v)\,[F^*_{nj}(u \wedge v) - F^*_{nj}(u)\,F^*_{nj}(v)]\,dG(u)\,dG(v).$$

$\sigma_{nj}^2$ is finite, because $|J(u)J(v)|$ dominates the integrand for all $n$ and $j$ and A2 implies that the dominating function is integrable with respect to $dG(u)\,dG(v)$. A3 and A4 imply $F^*_{nj}(u \wedge v) - F^*_{nj}(u)F^*_{nj}(v)$ converges to $u \wedge v - uv$ uniformly in $j$, for $u$ and $v$ in a neighborhood of the support of $J$. Therefore the Lebesgue Dominated Convergence Theorem implies that $\sigma_{nj}^2$ converges uniformly in $j$ to $\sigma^2 := \iint J(u)\,J(v)\,[u \wedge v - uv]\,dG(u)\,dG(v)$. The uniformity of this convergence implies that $\sum_{j=1}^{tk(n)} \sigma_{nj}^2/k(n)$ converges uniformly to $\sigma^2 t$ and hence that $\tau_n(t)$ converges uniformly to $t$.

It remains only to verify the Lindeberg Condition for the $Z_{nj}$'s. Since $\sum_j \sigma_{nj}^2$ is of order $k(n)$, it suffices to show that the random variables $Z_{nj}^2$ are uniformly integrable. Since both indicators and probabilities are bounded, the definition of $Z_{nj}$ implies that

$$\int_{\{|Z_{nj}| \ge t\}} Z_{nj}^2\,dP \le 4\,P\{|Z_{nj}| \ge t\} \iint |J(u)J(v)|\,dG(u)\,dG(v).$$

Chebychev's inequality and the uniform convergence of $\sigma_{nj}^2$ imply that for $n$ sufficiently large $P\{|Z_{nj}| \ge t\} \le 2\sigma^2 t^{-2}$ uniformly in $j$. These two inequalities show that the $Z_{nj}^2$ are uniformly integrable and hence that the $Z_{nj}$ satisfy the Lindeberg Condition.

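
The limiting variance $\iint J(u)J(v)[u \wedge v - uv]\,dG(u)\,dG(v)$ can be evaluated numerically in simple cases. The sketch below is an illustration only: it assumes the limiting distribution is uniform on $(0,1)$, so that $G(u) = u$ and $dG(u) = du$, and it uses a plain midpoint rule. The function name `sigma_squared_uniform` is hypothetical.

```python
def sigma_squared_uniform(J, n_grid=200):
    """Midpoint-rule value of the double integral of
    J(u) J(v) (min(u, v) - u v) over the unit square.

    Assumes the limiting CDF is uniform on (0, 1), so dG(u) = du.
    """
    h = 1.0 / n_grid
    mids = [(i + 0.5) * h for i in range(n_grid)]
    weights = [J(u) for u in mids]
    total = 0.0
    for u, ju in zip(mids, weights):
        for v, jv in zip(mids, weights):
            total += ju * jv * (min(u, v) - u * v)
    return total * h * h

# Sanity check with constant weight 1: the integral equals 1/12,
# the variance of a uniform random variable on (0, 1).
approx = sigma_squared_uniform(lambda u: 1.0)
```

The constant weight does not satisfy the trimming described for A2; it is used here only because the integral is then known in closed form.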
Lemma 2.5.

A1-A7 imply $D_n \xrightarrow{P} 0$ as $n \to \infty$.

Proof. Since $D_n = M_{4n} + M_{5n}$, it suffices to show $M_{4n}$ and $M_{5n}$ converge in probability to zero. The proof for $M_{4n}$ follows that for $M_{0n}$ in Lemma 2.3 and is omitted. Expanding the definition of $M_{5n}$ (4) and integrating by parts yields

$$M_{5n}\!\left(\frac{m}{k(n)}\right) = \frac{-1}{\sqrt{k(n)}} \left[\sum_{j=1}^{m} \int_{[Y_{nj},1]} d(R_nJ(u)) - m\int u\,d(R_nJ(u))\right]. \tag{9}$$

A7 implies that $F$ and $F_n$ are strictly increasing and that $G$, $G_n$ and $R_n$ are continuous on the support of $J$. Therefore, $R_nJ$ is a continuous function, and $[Y_{nj},1]$ can be replaced by $(Y_{nj},1]$ in (9). Collecting terms after this substitution and using A2 and A6 gives

$$\left|M_{5n}\!\left(\frac{m}{k(n)}\right)\right| \le \sup_{a_1 \le u \le a_2} |U_{nm}(u)| \sup_{b_1 \le u \le b_2} |R_n(u)| \sup_{0 \le u \le 1} |J(u)| \le \Delta_n \sup_{a_1 \le u \le a_2} |U_{nm}(u)| \sup |J(u)|.$$

Lemma 2.2 can be extended to imply (in the presence of A6) that $\sup_{a_1 \le u \le a_2} |U_{nm}(u)|$ converges to zero in probability. Therefore $M_{5n}$ converges weakly to 0 in $C[0,1]$, and the proof of Lemma 2.5 is complete.

3. Percentiles.

Let $p$ be a number between 0 and 1. $\xi_n$ will denote the $p$th percentile of the average CDF $F_n$, that is, the number such that $F_n(\xi_n) = p$. $\xi_n(m/k(n))$ will denote the $p$th percentile of $F_{n,m}$. $\hat\xi_n(m/k(n))$ will be the sample percentile of $\{X_{nj},\ 1 \le j \le m\}$. $\hat F_{n,m}$ will be the empirical distribution function of the same set of random variables. Set

$$s_n^2(m/k(n)) = \sum_{i=1}^{m} F_{ni}(\xi_n(m/k(n)))\,(1 - F_{ni}(\xi_n(m/k(n)))).$$

When no ambiguity will result, $\xi_n(1)$, $s_n(1)$ and $F_{n,k(n)}$ will be referred to as $\xi_n$, $s_n$ and $F_n$, respectively. The cumulative process discussed in this section is

$$W_n(t) := \frac{t\,k(n)\,(\hat\xi_n(t) - \xi_n(t))}{(k(n))^{1/2}}.$$

Define

$$V_n(t) := \frac{t\,k(n)}{(k(n))^{1/2}} \cdot \frac{p - \hat F_{n,k(n)t}(\xi_n(t))}{f_n(t)}$$

and $D_n(t) := W_n(t) - V_n(t)$. (See Assumption P2 below for the definition of $f_n$.) Theorem 3.2 below asserts that $D_n$ converges weakly to 0 in $C[0,1]$, and hence that $W_n$ inherits the asymptotic behavior of $V_n$.
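
The two processes can be computed side by side on a finite sample. The sketch below is an illustration under simplifying assumptions that are not part of the paper: the population percentile `xi` and the density `f` are taken to be the same for every partial row (as in the i.i.d. case), and the sample percentile of the first `m` observations is taken to be the order statistic of rank $\lceil mp \rceil$. The function name is hypothetical.

```python
import math

def percentile_processes(xs, p, xi, f):
    """Return lists (W, V) of process values at t = m/k for m = 0..k.

    Assumes xi is the p-th percentile and f the density at xi for every
    partial row, so that xi_n(t) and f_n(t) do not depend on t.
    """
    k = len(xs)
    root_k = math.sqrt(k)
    W, V = [0.0], [0.0]
    for m in range(1, k + 1):
        head = sorted(xs[:m])
        # Assumed convention: sample percentile = order statistic of rank ceil(m p).
        sample_pct = head[math.ceil(m * p) - 1]
        # Empirical CDF of the first m observations, evaluated at xi.
        ecdf_at_xi = sum(1 for x in head if x <= xi) / m
        W.append(m * (sample_pct - xi) / root_k)
        V.append(m * (p - ecdf_at_xi) / (f * root_k))
    return W, V

# Tiny hand-checkable example: p = 1/2, xi = 1/2, f = 1 (uniform on (0, 1)).
W, V = percentile_processes([0.2, 0.8, 0.4, 0.6], p=0.5, xi=0.5, f=1.0)
```

For a sample this small the two paths differ visibly; the result described in this section concerns their difference as the row length grows.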

The first piece in the proof of Theorem 3.2 is Theorem 3.1, an extension to triangular arrays of Bahadur's approximation of quantiles. The proof uses Lemma 3.1, an inequality proved in Hoeffding (1956). Corollary 3.1 gives a reduction to i.i.d. sequences. Lemma 3.3 states the finite dimensional distributions of $V_n$, and Lemma 3.4 gives the tightness of $V_n$. The proof of Lemma 3.5 uses a convenient version of $V_n$ and $W_n$ and Corollary 3.1 to show that their difference $D_n$ is tight. Theorem 3.2 gives the asymptotic behavior of $W_n$, the cumulative percentile process.

The following assumptions will be used in the first part of this section:

P1: $\{X_{ni},\ 1 \le i \le k(n)\}$ are mutually independent random variables and $X_{ni} \sim F_{ni}$.

P2: $F_{n,k(n)t}$ is continuously differentiable at $\xi_n(t)$ with derivative $f_n(t)$ satisfying $0 < \inf f_n(t) \le \sup f_n(t) < \infty$.

P3: $\lim_{n\to\infty} k(n)/\ln n = \infty$ and the sequence of constants $a_n$ satisfies the two conditions below:

$$\lim_{n\to\infty} \frac{f_n^2(t)\,a_n^2\,k(n)}{\ln n} > 2p,$$

$$\lim_{n\to\infty} a_n = 0.$$

P4: $\lim_{n\to\infty} k(n)\,(\ln n)^{-4} = \infty$.

P5: The variances $s_n^2(t)$ satisfy

$$\lim_{n\to\infty} \frac{s_n^2(t)}{k(n)} =: \rho(t) > 0,$$

$$\lim_{n\to\infty} \sup_{0 < t \le 1,\ 0 < s \le 1} |\xi_n(s) - \xi_n(t)| = 0.$$

P2 reduces to the usual assumption of a non-zero density in the i.i.d. case. P3 is primarily a pair of growth conditions on the constants $a_n$ used in the approximation (Theorem 3.1). P2 and P4 imply that $a_n = \ln n/((k(n))^{1/2} f_n(t))$ satisfies P3 and that $O((a_n \ln n)^{1/2}) = o(1)$. P5 imposes some regularity conditions on $F_{ni}$ as a function of $i$. The first condition of P5, although more restrictive than necessary, is appropriate for $W_n$ as defined here. Additional notation will be introduced before Lemma 3.4 and will be used to give a simpler, more restrictive replacement for P5.
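
The choice $a_n = \ln n/((k(n))^{1/2} f_n(t))$ can be checked numerically against the growth conditions. The following is a small sketch with hypothetical function names; `f` stands for a fixed value of $f_n(t)$, and the sample values of `n` and `k` are arbitrary.

```python
import math

def a_n(n, k, f):
    """The constants a_n = ln n / (sqrt(k) * f)."""
    return math.log(n) / (math.sqrt(k) * f)

def p3_ratio(n, k, f):
    """f^2 * a_n^2 * k / ln n, the quantity whose limit P3 requires to exceed 2p."""
    a = a_n(n, k, f)
    return f * f * a * a * k / math.log(n)

def remainder_rate(n, k, f):
    """(a_n * ln n)^(1/2), the order of the remainder in Theorem 3.1."""
    return math.sqrt(a_n(n, k, f) * math.log(n))

# With this a_n the ratio simplifies to ln n, which grows without bound,
# while the remainder rate shrinks when k grows like n.
ratio = p3_ratio(1000, 1000, 2.0)
rate_small = remainder_rate(10**3, 10**3, 1.0)
rate_large = remainder_rate(10**6, 10**6, 1.0)
```

With $k(n) = n$ the remainder rate behaves like $\ln n / n^{1/4}$, which is the reason a condition of the form P4 is needed.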


Theorem 3.1.

Under the assumptions P1, P2 and P3 for $t = 1$, with probability 1,

$$D_n(1) = O((a_n \ln n)^{1/2}).$$

The conclusion of this theorem can be reexpressed as

$$(k(n))^{1/2}(\hat\xi_n - \xi_n) \overset{a.s.}{=} \frac{k(n)\,p - k(n)\,\hat F_n(\xi_n)}{f_n\,\sqrt{k(n)}} + O((a_n \ln n)^{1/2}).$$

In the case of iid sequences ($F_{ni} = F$, $X_{ni} = X_i$, $k(n) = n$, $a_n = n^{-1/2}\ln n$), this theorem is due to Bahadur (1966). The result was extended to m-dependent (possibly non-stationary) sequences by Sen (1968). Since the proof of Theorem 3.1 resembles their proofs, only the modifications of Bahadur's proof necessary for this extension will be given.

Lemma 3.1.

If $\{X_i,\ 1 \le i \le n\}$ are independent random variables taking the value 1 with probability $p_i$ and the value 0 with probability $1 - p_i$, if $S_n := \sum_{i=1}^{n} X_i$, if $np = \sum_i p_i$ and if $a \le np \le b$, then $P\{a \le S_n \le b\}$ is minimized for $p_i \equiv p$.

For a proof of this lemma, see Hoeffding (1956).
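
The inequality can be checked exactly on a small case. The sketch below computes the distribution of a sum of independent, non-identical Bernoulli variables by direct convolution and compares an interval probability with the equal-probability case. The success probabilities and the interval are arbitrary illustrative choices, and the function names are hypothetical.

```python
def poisson_binomial_pmf(probs):
    """Exact pmf of a sum of independent Bernoulli(p_i) variables."""
    pmf = [1.0]
    for q in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for s, mass in enumerate(pmf):
            nxt[s] += mass * (1.0 - q)   # this variable equals 0
            nxt[s + 1] += mass * q       # this variable equals 1
        pmf = nxt
    return pmf

def interval_prob(pmf, a, b):
    """P{a <= S <= b} for integer a <= b."""
    return sum(pmf[a:b + 1])

probs = [0.1, 0.5, 0.9]                  # unequal success probabilities
p_bar = sum(probs) / len(probs)          # common value with the same mean
unequal = interval_prob(poisson_binomial_pmf(probs), 1, 2)
equal = interval_prob(poisson_binomial_pmf([p_bar] * len(probs)), 1, 2)
```

Here the mean of the sum is 1.5, the interval $[1, 2]$ contains it, and the probability with unequal $p_i$ (0.91) is at least the probability with equal $p_i$ (0.75), as the lemma states.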

Sketch of proof of theorem:

Let $b_n$ denote $(a_n k(n)/\ln n)^{1/2}$, $I_n$ denote the interval $\xi_n \pm a_n$, $I_{nr}$ denote the interval $(\xi_n + r a_n/b_n,\ \xi_n + (r+1) a_n/b_n]$ and $U_{nr}$ the interval between $\xi_n$ and $\xi_n + r a_n/b_n$. The first step of the proof is to show that, with probability 1,

$$\hat F_n(x) = \hat F_n(\xi_n) + F_n(x) - F_n(\xi_n) + O((a_n \ln n/k(n))^{1/2})$$

uniformly in $I_n$, or that

$$H_n := \sup_{x \in I_n} |\hat F_n(x) - \hat F_n(\xi_n) - (F_n(x) - F_n(\xi_n))| = O((a_n \ln n/k(n))^{1/2}). \tag{10}$$

Simple algebra shows that

$$H_n \le \max_{|r| \le b_n} |\hat F_n(U_{nr}) - F_n(U_{nr})| + \max_{|r| \le b_n} |F_n(I_{nr})|.$$

The differentiability condition P2 implies the second term is $O(a_n/b_n) = O((a_n \ln n/k(n))^{1/2})$. Using Lemma 3.1, the probability that the first term exceeds any number $\gamma_n$ can be bounded by the corresponding expression when $F_{ni} \equiv F_n$. This latter probability is itself bounded by a sum of $k(n)$ probabilities that a binomial random variable exceeds $\gamma_n k(n)$. The binomial probabilities are bounded as in Bahadur (1966) and the Borel-Cantelli Lemma is applied (with $\gamma_n = \gamma a_n/b_n$) to complete the proof of (10).

The second step is to show that if $q_n = p\,k(n) + o(k(n)a_n)$, then $\{X_{ni},\ 1 \le i \le k(n)\}_{(q_n)} \in I_n$ for all $n$ sufficiently large with probability 1. This step follows from Lemma 3.1, Assumption P3 and the Borel-Cantelli Lemma. The remainder of the proof of Theorem 3.1 parallels Bahadur (1966).

Lemma 3.2.

If $Y_n$ is a sequence of random variables and $\psi(n)$ a monotone deterministic sequence converging to infinity such that $\lim_{n\to\infty} Y_n/\psi(n) \le a < \infty$ with probability 1, then, with probability 1,

$$\lim_{n\to\infty} \Big(\max_{m \le n} Y_m\Big)/\psi(n) \le a.$$

This  lemma  is  based  on  a  problem  in  Chung  (1974)  (p.  237,  pr.  2).  The  proof  is 


Corollary 3.1.

If $\{X_i,\ i \ge 1\}$ is a sequence of independent, identically distributed random variables with cumulative distribution function $F$, if $F$ has a continuous derivative $f$ at its $p$th percentile $\xi$, and if $k(n)$ satisfies P4, then $D_n$ converges weakly in $C[0,1]$ to the zero process and $W_n$ converges weakly to $(p(1-p))^{1/2}W$.

Proof: Since $W_n$ and $V_n$ are based on a single sequence, defining $Y_m := m((\hat\xi_m - \xi) - (p - \hat F_m(\xi)))$, it is easy to check that

$$\sup_{0 \le t \le 1} |V_n(t) - W_n(t)| = \max_{1 \le m \le k(n)} Y_m.$$

If $a_n$ is set equal to $(\ln n)(k(n)^{1/2} f(\xi))^{-1}$, P4 implies P3. Theorem 3.1 therefore implies that with probability 1, $Y_{k(n)} = O(\ln n\,(k(n))^{-1/4}) = o(1)$, or equivalently that $Y_{k(n)}(k(n))^{1/4}/\ln n = O(1)$, a.s. Lemma 3.2 implies that $\sup_{0 \le t \le 1} |W_n(t) - V_n(t)| = o(1)$. Thus $W_n - V_n$ converges weakly to the zero process. Because $\xi_n(t) \equiv \xi$, $V_n$ is a normalized cumulative sum process ($V_n(t) = (k(n))^{-1/2}\sum_{i=1}^{k(n)t}(p - I\{X_i \le \xi\})$), and Donsker's Theorem implies that $V_n$ converges weakly to $(p(1-p))^{1/2}W$, a non-degenerate limit. Together with Slutsky's Theorem, this implies the same weak limit for $W_n$.

Lemma 3.3.

Under the assumptions P1, P2, P4 and P5, the finite-dimensional distribution functions of $V_n$ converge to those of $W \circ \rho$. (See P5 for the definition of $\rho$.)

Proof: $V_n(t)$ can be rewritten as

$$V_n(t) = \frac{\sum_{i=1}^{tk(n)} (p - I\{X_{ni} \le \xi_n(t)\})}{(k(n))^{1/2}} = \left[\frac{\sum_{i=1}^{tk(n)} (p - I\{X_{ni} \le \xi_n(t)\})}{s_n(t)}\right] \frac{s_n(t)}{(k(n))^{1/2}}.$$

The definition of $\xi_n(t)$ implies that $EV_n(t) = 0$. Assumption P5 implies that $s_n^2(t)$, the variance of $\sum_{i=1}^{tk(n)} (p - I\{X_{ni} \le \xi_n(t)\})$, becomes infinite with $n$. Since each summand is bounded, Lindeberg's condition for triangular arrays (see Billingsley (1968), p. 42) is satisfied trivially for all $n$ sufficiently large, and the term in brackets converges in distribution to the standard normal distribution. P5 therefore implies that $V_n(t)$ converges weakly to a normal distribution with (positive) variance $\rho(t)$.

It  remains  only  to  show  that  V  (t  )  -  V  (t  )  is  asymptotically  independent  of 

n  2  n  1 

V  (t, ).  An  increment  of  V  has  the  form 
n  1  n 


t^k  (n) 


(11) 


vv - VV  = 


..  t^k (n) 

i=t  k(n)  +  l  P  {Xni-W  1  J  Vx  (t 
1  i=i  ni—  n 


„  .  ,  ..,))  J(X  . (t.)  1  • 
ni—  n  2  ni—  n  1 


,1/2 


.1/2 


(k (n)  )  (k (n)  ) 

The first term of (11) is independent of $V_n(t_1)$, by assumption P1. The second term is (up to a sign change) a normalized summation of $t_1k(n)$ independent Bernoulli random variables, with parameters $F_{ni}(a_n, b_n]$, where $a_n = \xi_n(t_1) \wedge \xi_n(t_2)$ and $b_n = \xi_n(t_1) \vee \xi_n(t_2)$. The variance of this normalized summation is


$$\sum_{i=1}^{t_1k(n)} \frac{F_{ni}(a_n,b_n]\,[1 - F_{ni}(a_n,b_n]]}{s_n^2(t_1)} \le \frac{k(n)\,t_1\,F_{n,t_1k(n)}(a_n,b_n]}{s_n^2(t_1)} \approx \frac{k(n)\,t_1\,f_n(t_1)\,|\xi_n(t_1) - \xi_n(t_2)|}{s_n^2(t_1)} \to 0 \quad (n \to \infty).$$


The approximation follows from P2 and the convergence to zero is a consequence of P2 and P5. Since the variance of the second summation converges to zero, the summation converges weakly to zero, and $V_n(t_2) - V_n(t_1)$ is asymptotically independent of $V_n(t_1)$. The extension to the joint distribution of $(V_n(t_1), \ldots, V_n(t_\ell))$ is routine. ■

Note that if the last condition of P5 is weakened, asymptotic independence does
not obtain. If $\xi_n(t)$ exhibits other systematic behavior, a Brownian bridge component
may result.

The  notation  which  follows  will  be  used  for  the  rest  of  this  section. 

Define the stochastic majorant $\bar F_n$ and the stochastic minorant $F'_n$ of $\{F_{nj},\ 1 \le j \le k(n)\}$ by (12) and (13):

(12)  $\bar F_n(t) = \min_{1 \le j \le k(n)} F_{nj}(t)$

(13)  $F'_n(t) = \max_{1 \le j \le k(n)} F_{nj}(t)$

Let $\bar\xi_n$ and $\xi'_n$ be the corresponding percentiles defined by $\bar F_n(\bar\xi_n) = F'_n(\xi'_n) = p$.

These definitions imply the following inequalities:

(14)  $\bar F_n^{-1}(u) \ge F_{ni}^{-1}(u) \ge F_n'^{-1}(u)$,  $1 \le i \le k(n)$,  $0 \le u \le 1$

(15)  $\bar\xi_n \ge \xi_n(t) \ge \xi'_n$,  $0 \le t \le 1$
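The envelope construction in (12) to (15) can be illustrated numerically. The sketch below assumes, purely for illustration, that the row distributions are normal with a common scale and different locations; the helper names and the bisection bounds are hypothetical. It computes the pointwise minimum and maximum of the CDFs, solves for their $p$th percentiles, and returns them alongside the individual percentiles so that the ordering in (15) can be inspected.

```python
from statistics import NormalDist


def majorant_cdf(dists, x):
    """Pointwise minimum of the CDFs: the stochastic majorant of (12)."""
    return min(d.cdf(x) for d in dists)


def minorant_cdf(dists, x):
    """Pointwise maximum of the CDFs: the stochastic minorant of (13)."""
    return max(d.cdf(x) for d in dists)


def percentile(cdf, p, lo=-50.0, hi=50.0, tol=1e-10):
    """Solve cdf(x) = p by bisection, for a continuous increasing cdf."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def envelope_percentiles(dists, p):
    """Return (majorant percentile, individual percentiles, minorant percentile)."""
    upper = percentile(lambda x: majorant_cdf(dists, x), p)
    lower = percentile(lambda x: minorant_cdf(dists, x), p)
    return upper, [d.inv_cdf(p) for d in dists], lower
```

For a pure location family the majorant percentile coincides with the largest individual percentile and the minorant percentile with the smallest, which makes the ordering easy to verify by hand.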

A sufficient condition for assumption P5 with $\rho(t) = p(1-p)t$ can now be stated:


P6:  There is a positive function $H$, a distribution function $F$ and a sequence of positive numbers $\delta_n$ such that $H$ and $F$ have positive derivatives at $\xi$ and the following inequalities hold:

(a)  $\bar F_n^{-1}(u) \le F^{-1}(u) + \delta_n H(u)$

(b)  $F_n'^{-1}(u) \ge F^{-1}(u) - \delta_n H(u)$

(c)  $\lim_{n\to\infty} \delta_n k(n) < \infty$.

P6 is strong enough to imply that $\bar F_n$ and $F'_n$ satisfy similar constraints in a neighborhood of $\xi$. In particular, P6 implies

(16)  $(F'_n(\bar\xi_n) - p) \vee (p - \bar F_n(\xi'_n)) = O(\delta_n)$.

One  further  assumption  will  also  be  used: 

P7:  The derivative of $F$ at $\xi$ satisfies

$$\lim_{n\to\infty} \sup_{0 \le t \le 1} |f_n(t) - f(\xi)| = 0.$$

Let $\{U_j,\ 1 \le j < \infty\}$ be a sequence of independent uniform random variables. By the independence of the $X_{nj}$'s,

(17)  $\{X_{nj},\ 1 \le j \le k(n)\} \stackrel{D}{=} \{F_{nj}^{-1}(U_j),\ 1 \le j \le k(n)\}$.

The symbol $*$ will be used to denote the version of a process based on $\{F_{ni}^{-1}(U_i),\ 1 \le i \le k(n)\}$. For example, $V_n^*(t)$ is $V_n(t)$ with each $X_{ni}$ replaced by $F_{ni}^{-1}(U_i)$. By (17), $V_n^* \stackrel{D}{=} V_n$.

Lemma  3.4. 

If P1 and P6 hold, the sequence of probability measures on $C[0,1]$ generated by the sequence of processes $V_n$ is tight.

Proof: $V_n^*$ reduces to

$$V_n^*(t) = \frac{1}{(k(n))^{1/2}} \sum_{i=1}^{tk(n)} \left[ (p - I\{U_i \le p\}) + (I\{U_i \le p\} - I\{U_i \le F_{ni}(\xi_n(t))\}) \right].$$

(The statement above will hold only almost surely if, for some $i$, $F_{ni}$ is flat at $\xi_n(t)$.) Donsker's Theorem implies that

$$\sum_{i=1}^{tk(n)} \frac{p - I\{U_i \le p\}}{(k(n)p(1-p))^{1/2}} \Rightarrow W$$

and hence that

(18)  $\frac{1}{(k(n))^{1/2}} \sum_{i=1}^{tk(n)} (p - I\{U_i \le p\})$

is tight. Set

$$Y_{ni} = I\{U_i \le p\} - I\{U_i \le F_{ni}(\xi_n(t))\}.$$

Since (18) is tight, $V_n^*$ will be tight if

(19)  $\sum_{i=1}^{tk(n)} \frac{Y_{ni}}{(k(n))^{1/2}}$

is tight.

$Y_{ni}$ is 1 with probability $(p - F_{ni}(\xi_n(t)))^+$, $-1$ with probability $(F_{ni}(\xi_n(t)) - p)^+$, and 0 with probability $1 - |p - F_{ni}(\xi_n(t))|$. The inequality (15), the definitions (12) and (13), and the approximation (16) imply that there is a finite constant $C$ such that

$$|Y_{ni}| \le I\{p - C\delta_n \le U_i \le p + C\delta_n\};$$

therefore

$$\left| \sum_{i=1}^{k(n)t} \frac{Y_{ni}}{(k(n))^{1/2}} \right| \le \sum_{i=1}^{k(n)} \frac{I\{p - C\delta_n \le U_i \le p + C\delta_n\}}{(k(n))^{1/2}} = Z_n.$$

The variance of $Z_n$ is less than $2k(n)C\delta_n/k(n) = 2C\delta_n$. Since P6 implies $\delta_n$ converges to zero, the variance of $Z_n$ converges to zero, $Z_n$ is tight, (19) is tight, and $V_n$ is tight.

Lemma 3.5.

Assumptions P1, P4, P6 and P7 imply that the sequence of probability measures on $C[0,1]$ generated by the sequence of processes $D_n$ is tight.

Proof: Let $\bar\xi_n^*(m/k(n))$ denote the $p$th percentile of $\{\bar F_n^{-1}(U_j),\ 1 \le j \le m\}$ and $\bar F^*_{n,m}$ the empirical CDF of the same set of random variables. The proof of Lemma 3.5 consists of showing that $D_n^*$ is bounded above and below by the sum of three tight processes, and is therefore tight. Only the proof for the upper bound is given here, since the proof for the lower bound is essentially identical.

Define $D^*_{nm} = (k(n))^{1/2} D_n^*(m/k(n))/m$. This can be reexpressed as

$$D^*_{nm} = \xi_n^*(m/k(n)) - \xi_n(m/k(n)) - \frac{p - F^*_{n,m}(\xi_n(m/k(n)))}{f_n(m/k(n))}.$$

Since (14) implies that $\bar\xi_n^*(m/k(n)) \ge \xi_n^*(m/k(n))$, substitution in $D^*_{nm}$ (and adding a complicated form of zero) yields

(20)  $D^*_{nm} \le [\bar\xi_n^*(m/k(n)) - \bar\xi_n - (\bar F_n(\bar\xi_n) - \bar F^*_{n,m}(\bar\xi_n))/f_n(m/k(n))]$

      $+\ [\bar\xi_n - \xi_n(m/k(n))] + [(\bar F_n(\bar\xi_n) - F_{nm}(\xi_n(m/k(n))))/f_n(m/k(n))]$

      $+\ [(F^*_{n,m}(\xi_n(m/k(n))) - \bar F^*_{n,m}(\bar\xi_n))/f_n(m/k(n))]$

(21)  $= \bar D^*_{nm} + [\bar\xi_n - \xi_n(m/k(n))] + [F^*_{n,m}(\xi_n(m/k(n))) - \bar F^*_{n,m}(\bar\xi_n)]/f_n(m/k(n))$,

where $\bar D^*_{nm}$ denotes the quantity inside the first set of square brackets on the RHS of (20). (The quantity inside the third set of square brackets on the RHS of (20) reduces to zero.) We shall show that each of the terms in (21) generates a tight process when multiplied by $m(k(n))^{-1/2}$.

The process generated by the second term of (21) is uniformly bounded by $(k(n))^{1/2}(\bar\xi_n - \xi'_n)$, which is itself bounded by $2(k(n))^{1/2}\delta_n H(p)$. The third condition of P6 ensures that this quantity is bounded, and hence that the second term of (21) generates a tight process.

The convergence of the other two processes will be inferred from the underlying sequence of uniform random variables. Set $Y_m := \{U_j,\ 1 \le j \le m\}_{(mp)}$, that is, the $(mp)$th order statistic of the first $m$ uniform variables. For the first process, note that the construction implies

(22)  $\bar\xi_n^*(m/k(n)) - \bar\xi_n = \bar F_n^{-1}(Y_m) - \bar F_n^{-1}(p)$.


Assumption P6 implies that the right hand side of (22) is at most $F^{-1}(Y_m) + \delta_n H(Y_m) - (F^{-1}(p) - \delta_n H(p))$. Therefore the process generated by $\bar D^*_{nm}$ satisfies the following inequality:

(23)  $\frac{m \bar D^*_{nm}}{(k(n))^{1/2}} \le \frac{m}{(k(n))^{1/2}} \left[ F^{-1}(Y_m) - F^{-1}(p) - \frac{1}{m} \sum_{j=1}^{m} \frac{p - I\{F^{-1}(U_j) \le F^{-1}(p)\}}{f(\xi)} \right]$

(24)  $\qquad +\ \frac{m\delta_n}{(k(n))^{1/2}} \left( H(Y_m) + H(p) \right)$

(25)  $\qquad -\ \frac{1}{(k(n))^{1/2}} \left( \frac{1}{f_n(m/k(n))} - \frac{1}{f(\xi)} \right) \sum_{j=1}^{m} \left( p - I\{F^{-1}(U_j) \le F^{-1}(p)\} \right)$.

The first term of the bound is of the form of the process of Corollary 3.1. Since
the assumptions of Corollary 3.1 hold, this term converges weakly to zero and is
consequently tight.

The process determined by (24) is uniformly bounded by $(k(n))^{1/2}\delta_n(\max\{|H(Y_m)| : m \le k(n)\} + |H(p)|)$. With probability 1, $Y_m$ converges to $p$ as $m$ increases, so for every positive $\varepsilon$ there are an integer $N_\varepsilon$ and a number $K_\varepsilon$ such that $P\{\max\{|H(Y_m)| : m > N_\varepsilon\} \ge K_\varepsilon\} < \varepsilon/2$ and $P\{\max\{|H(Y_m)| : m \le N_\varepsilon\} \ge K_\varepsilon\} < \varepsilon/2$; hence the sequence $\{\max\{|H(Y_m)| : m \le k(n)\},\ n \ge 1\}$ is tight. Assumption P6 implies that $m\delta_n(k(n))^{-1/2}$ is bounded for $m \le k(n)$ and therefore that the process (24) is tight.

The process generated by (25) contains a cumulative sum of i.i.d. Bernoulli random variables. Since this sum multiplied by $(k(n))^{-1/2}$ converges to a Gaussian process, and since Assumption P7 implies that $(1/f_n(m/k(n)) - 1/f(\xi))$ converges to zero, the process generated by (25) is also tight. Therefore the process corresponding to the left-hand side of (23) is tight.

For the third process, note that

(26)  $m\,|F^*_{n,m}(\xi_n(m/k(n))) - \bar F^*_{n,m}(\bar\xi_n)|\,(k(n))^{-1/2} \le (k(n))^{-1/2} \sum_{i=1}^{m} I\{\bar F_n(\xi'_n) < U_i \le p\}$.

Therefore the process generated by the left-hand side of (26) is uniformly bounded
by the random quantity

(27)  $(k(n))^{-1/2} \sum_{i=1}^{k(n)} I\{\bar F_n(\xi'_n) < U_i \le p\}$.


The sum is a binomial random variable with parameters $k(n)$ and $p - \bar F_n(\xi'_n)$. Thus the variance of the bound (27) is less than $p - \bar F_n(\xi'_n)$, which converges to zero. Therefore the process generated by the third term of (21) is tight.

This completes the proof that the process $D_n^*$ has an upper bound which is tight. The fact that $D_n^*$ has a lower bound which is tight is shown in exactly the same manner. Therefore $D_n^*$ itself is a tight process, and since $D_n^* \stackrel{D}{=} D_n$, $D_n$ is tight. ■

Theorem  3.2. 


If P1, P2, P4, P6 and P7 hold, $W_n$ converges weakly in $C[0,1]$ to $(p(1-p))^{1/2} W$.

Proof: Applying Theorem 3.1 to $\{X_{nj},\ j \le tk(n)\}$ for each fixed $t$ in $[0,1]$ shows that the finite dimensional distribution functions of $D_n$ converge to those of the zero process. By Lemma 3.5, $D_n$ converges weakly to zero. Since P6 forces $\rho(t) = p(1-p)t$, Lemmas 3.3 and 3.4 imply that $V_n$ converges weakly to $W \circ \rho = (p(1-p))^{1/2} W$. Since $W_n = V_n + D_n$, the theorem follows.
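A small simulation can make the conclusion of Theorem 3.2 concrete. The sketch below is an illustration under stated assumptions, not the construction used in the proof: it takes i.i.d. uniform(0,1) observations (for which the density at the percentile equals 1), treats the $p$th percentile of $m$ observations as the $\lceil mp \rceil$th order statistic, and evaluates a process of the form $t\,k^{1/2}(\xi^*(t) - \xi)$ at a few time points. All function names, the seed and the sample sizes are hypothetical choices.

```python
import math
import random


def cumulative_percentile_path(sample, p, xi, times):
    """W(t) = t * sqrt(k) * (pth percentile of the first t*k observations - xi)."""
    k = len(sample)
    path = []
    for t in times:
        m = int(t * k)
        order = sorted(sample[:m])
        # Assumed convention: the pth percentile is the ceil(m*p)-th order statistic.
        quantile = order[max(math.ceil(m * p), 1) - 1]
        path.append(t * math.sqrt(k) * (quantile - xi))
    return path


def percentile_process_variances(k=200, p=0.5, times=(0.5, 1.0), reps=2000, seed=1):
    """Monte Carlo variances of W(t) at each t, for uniform(0,1) data (an assumption)."""
    rng = random.Random(seed)
    paths = [
        cumulative_percentile_path([rng.random() for _ in range(k)], p, p, times)
        for _ in range(reps)
    ]
    variances = []
    for j in range(len(times)):
        column = [path[j] for path in paths]
        mean = sum(column) / reps
        variances.append(sum((v - mean) ** 2 for v in column) / reps)
    return variances
```

If the limit is $(p(1-p))^{1/2} W$, the estimated variances at $t = 0.5$ and $t = 1$ should be near $0.125$ and $0.25$ when $p = 0.5$, with small finite-sample and Monte Carlo deviations.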
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Many  papers  have  considered  cumulative  processes  based  on  i.i.d.  sequences.  These 
include  Braun  (1976)  and  Lai  (1975)  for  rank  test  statistics,  Miller  and  Sen  (1972)  for 
U-statistics  and  von  Mises'  differentiable  statistical  functions,  and  Guiahi  (1975)  and 
Ghosh  and  Sen  (1976)  for  linear  combinations  of  order  statistics.  All  of  these  authors 
obtain  limiting  Gaussian  processes.  Lamperti  (1964)  showed  that  normalized  cumulative 
maximum processes converge weakly to extremal processes. Welsch (1973) is the third
in a series of papers extending Lamperti's conclusions to certain strong-mixing
Gaussian sequences. However, none of these papers applies to non-trivial triangular
arrays.

Some other papers are less closely related than their titles suggest. The strong quantile process approximation of Csörgő and Révész (for example, Csörgő and Révész (1978)) is primarily concerned with quantile processes as a function of $p$, rather than cumulative processes. Their method uses an embedding which is suitable for the i.i.d. case. Guiahi (1975) and Sen (1978) discuss a process based on a tail-sequence of linear combinations of order statistics from an i.i.d. sequence. Sen (1979) derives a Gaussian limit for a process based on $\{X_{(1)}, \ldots, X_{(ntp)}\}$. This process differs from the process considered here in that the values of the X's themselves, rather than the values of their indices, determine which random variables are used to construct the process.

The  weaker  conclusion  of  asymptotic  normality  has  been  studied  for  percentiles 
and  smooth  linear  combinations  of  order  statistics  in  the  case  of  independent,  non- 
identically  distributed  random  variables.  The  assumptions  A1-A7  will  therefore  be 
compared  with  the  assumptions  of  Stigler  (1972)  and  Shorack  (1972,  1973),  as  well  as 
those  of  Guiahi.  The  assumptions  P1-P7  will  be  compared  with  those  of  Sen  (1968)  and 
Weiss  (1969). 

Theorem 2.1 is an analog of Guiahi's Theorem 1. Guiahi derives his version with
three regularity conditions and the assumption that $F_{ni} = F$ for all $n$ and
$1 \le i \le n$. A3, A4, A5, and A6 are all immediate in this case. The first regularity
condition is that the first absolute moment of a random variable with cumulative
distribution $F$ be finite. There is no moment condition for $F$ in A1-A7. The second
condition is a smoothness condition for the weight function $J$. This condition is
equivalent to A1, with $\gamma = 1$. A2 is not imposed. The third condition is that the
total variation of the product of $G$ and $J'$ be finite. This condition is a consequence
of A1 and A2. The proof here is similar to that of Guiahi, but the trimming
condition A2 is imposed and tail conditions on $F$ are weakened.

Stigler (1974) uses Hájek's projection theorem to derive the asymptotic normality
of linear combinations of order statistics. Wesley (1977) discusses Stigler's results,
gives a counterexample to the theorem as stated, and indicates a few corrections.
(See also Stigler (1979).) Since the theorems in question concern the non-identically
distributed case, the conditions are more complicated than Guiahi's. There are two
conditions on the distributions: a tightness condition and a convergence condition.

The tightness condition is the requirement that there exist a finite number $M$, a
positive constant $\varepsilon$, and a cumulative distribution function $H$ such that (28) and
(29) hold.


(28)  $F_{nk}(y) \le H(y)$ for $y < -M$,  and  $F_{nk}(y) \ge H(y)$ for $y > M$

(29)  $\lim_{x\to\infty} x^{\varepsilon}\,(1 - H(x) + H(-x)) = 0$
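The tail condition can be explored numerically. The sketch below reads (29) as the requirement that $x^{\varepsilon}(1 - H(x) + H(-x))$ tends to zero, and evaluates that quantity for the standard Cauchy distribution, which the later discussion cites as satisfying (29). The function names and the chosen values of $\varepsilon$ are illustrative assumptions; for the Cauchy distribution the quantity behaves like $2x^{\varepsilon - 1}/\pi$, so it vanishes only when $\varepsilon < 1$.

```python
import math


def cauchy_cdf(x):
    """CDF of the standard Cauchy distribution."""
    return 0.5 + math.atan(x) / math.pi


def weighted_tail(x, eps, cdf=cauchy_cdf):
    """x**eps * (1 - H(x) + H(-x)), the quantity whose limit is taken in (29)."""
    return x ** eps * (1.0 - cdf(x) + cdf(-x))
```

Evaluating `weighted_tail` at increasing `x` with `eps = 0.5` gives a decreasing sequence, while `eps = 1.5` gives an increasing one, so a small positive $\varepsilon$ is what makes the condition hold for such heavy tails.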

The convergence condition is that (30) and (31) hold for almost all $x$ and $y$.

(30)  $\lim_{n\to\infty} F_n(x) = F(x)$

(31)  $\lim_{n\to\infty} \sum_{k=1}^{n} \frac{F_{nk}(\min(x,y)) - F_{nk}(x)F_{nk}(y)}{n} = K(x,y)$.


Condition (30) asserts that the average cumulative distribution $F_n$ converges weakly
to $F$. The assumption (30) is stronger than A4, which is better suited to the trimming
assumption in that the convergence is required only for $x$ in a neighborhood of the
support of the weight function $J$. A3 and A4 imply (31) for $x$ and $y$ in a
neighborhood of the support of $J$, with $K(x,y) = F(\min(x,y)) - F(x)F(y)$.
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Stigler  places  two  conditions  on  the  weight  function  J.  The  first  condition  is 
the same as A2, which requires that J trim. The second condition imposes some smoothness
conditions on J. J is required to be bounded and to satisfy a Hölder condition
for  y  >  except  at  possibly  finitely  many  points,  each  of  which  is  a  null  set  for 
every $F_{ni}$. This condition allows J to have a finite set of discontinuities as long
as the corresponding quantiles are well-defined for all the distributions $F_{ni}$. This
is a weaker condition than A1, which assumes J differentiable and imposes the Hölder
condition on $J'$. The necessity of the condition that the discontinuities of J correspond
to well-defined quantiles was demonstrated in Stigler (1973), where the limiting
distribution of the trimmed mean was shown to be non-normal if the trimming fractions
correspond to ill-defined percentiles.

The final stages of Stigler's proof are very similar to the proof given in this
chapter. The asymptotic normality is obtained from a normalized sum of the random
variables $Z'_{nj}$, where

$$Z'_{nj} = \int_{-\infty}^{\infty} [F_{nj}(y) - I\{X_{nj} \le y\}]\, J(F_n(y))\, dy.$$

The random variables used in Section 2 are $Z_{nj}$, where

$$Z_{nj} = \int J(u)\, [F_{nj}(G(u)) - I\{X_{nj} \le G(u)\}]\, dG(u).$$

If the average cumulative distribution functions $F_n$ are all equal to $F$ and if $F$
is strictly increasing, $Z'_{nj} = Z_{nj}$. Otherwise, the random variables are not necessarily
equal.

Since the use of the projection theorem avoids the Mean Value Theorem, the differentiability conditions on $J$ are unnecessary. In his proof that (in the notation of this chapter) $S_n(1)$ is asymptotically equivalent to a normalized sum of $Z'_{nj}$, Stigler avoids writing out the remainder term explicitly. However, his techniques cannot be extended to show that a corresponding remainder process converges in probability to zero in $C[0,1]$. The methods of Section 2 are also difficult to apply, because the remainder process (from $Z'_{nj}$) will not decompose conveniently into processes which can be treated individually.
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Shorack  has  derived  the  asymptotic  normality  of  linear  combinations  of  order 

statistics  from  weak  convergence  of  the  empirical  processes.  His  theorems  apply  to 

more  general  statistics  than  Stigler's,  allowing  finitely  many  percentiles  to  receive 

asymptotically  non-negligible  weight  and  including  sums  of  (non-monotone)  functions  of 

order statistics. Shorack (1972) contains the basic proofs, but the conditions require

that $F_{ni} \equiv F_n$, $1 \le i \le n$. In a later paper (Shorack (1973)), a more general theorem

is  stated.  The  proof  consists  of  a  list  of  substitutions  in  the  earlier  proof.  In  the 
general  case,  one  of  two  triples  of  assumptions  is  required.  The  first  assumption  of 
both  triples  is  that  there  exist  a  finite  M  and  a  positive  ,l>  such  that  (32)  and  (33) 
hold  for  all  u  in  the  unit  interval. 


(32) 

G  (u)  |  _  M[u(l  -  u)) 

(33) 

|g  (u)  ;  m i u < i  -  u) ) 

n  — 

This condition is a fairly strict tightness condition. Shorack's theorems do not imply
Stigler's results, since the Cauchy distribution satisfies (29), but not (32). The
second basic assumption is that J be continuous, except possibly at ill-defined
quantiles.  The  third  condition  is  that  (34)  hold,  with  the  same  as  above. 

1-  • 

(34)  /  ft  (1  -  t )  1  2  d'c;n  -  t;1  (t)  . 

The second assumption can be replaced by the condition that $J$ be continuous and the
third condition can be replaced by the convergence (35), for all $u$ such that $G$ is
continuous at $u$.

(35)  $\lim_{n\to\infty} G_n(u) = G(u)$.

Condition  (35)  implies  that  the  quantiles  of  Fn  converge  to  the  quantiles  of  F, 
whenever  the  latter  are  well-defined.  This  condition  is  equivalent  to  the  weak 
convergence  (30). 

Neither Shorack nor Stigler requires a condition resembling A3, which asserts that the $F_{ni}$ approach each other quickly enough in Kolmogorov-Smirnov distance. Indeed, such a condition will not be necessary for asymptotic normality. Consider the situation in which, for every $n$, half of the $F_{ni}$ equal $H_1$ and the rest equal $H_2$. The mean cumulative distribution function will converge to $H$, the average of $H_1$ and $H_2$. The covariance function $K$ will also be simply related to $H_1$ and $H_2$. Therefore, if $H_1$, $H_2$, and $J$ are sufficiently regular, the weighted linear combination of order statistics will be asymptotically normal. However, a theorem like Theorem 2.1 cannot hold without additional conditions, because the process $S_n$ is not only a function of the order statistics of all $n$ random variables, but also depends on the order statistics of $X_{n1}, \ldots, X_{nm}$ for every $m \le n$. The asymptotic behavior of the process will depend on which half of the $F_{ni}$ are equal to $H_1$. The weak convergence of the process cannot be obtained without additional conditions, such as A3.
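The dependence on the arrangement can be seen in a deterministic calculation. The sketch below assumes, for illustration only, that $H_1$ and $H_2$ are normal distributions with different means, and compares a row in which all the $H_1$ variables come first with a row in which $H_1$ and $H_2$ alternate. The helper names are hypothetical. The percentile of the average CDF over the full row is the same for both arrangements, but the percentile over the first half of the row is not.

```python
from statistics import NormalDist


def average_cdf(dists, m, x):
    """Average CDF of the first m distributions in the row."""
    return sum(d.cdf(x) for d in dists[:m]) / m


def percentile_of_average(dists, m, p, lo=-50.0, hi=50.0, tol=1e-10):
    """Solve average_cdf(x) = p by bisection."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if average_cdf(dists, m, mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def two_arrangements(n, h1, h2):
    """Blocked row (all H1 first) and alternating row; each is half H1, half H2."""
    blocked = [h1] * (n // 2) + [h2] * (n // 2)
    alternating = [h1, h2] * (n // 2)
    return blocked, alternating
```

With $H_1 = N(0,1)$ and $H_2 = N(2,1)$, the median of the average CDF over the first half of the blocked row is 0, while for the alternating row it is 1, so the centering of a cumulative process differs between the two rows even though the full-sample statistic has the same limit.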

Sen (1968) extends Bahadur (1966) to m-dependent sequences. Weiss (1969) uses
moment generating function techniques to study the joint asymptotic behavior of several
sample percentiles based on a triangular array of independent, non-identically distributed
random variables. We examine Sen's conditions for independence, Weiss's condition
for a single percentile, and P1-P7 with $k(n) = n$.

The most obvious difference is that Weiss weakens the first condition of P5 (which is only slightly stronger than Sen's assumption that $\inf_n (s_n^2(1)/n) > 0$) to $\lim (n^{2/3}/s_n^2(1)) = 0$. The only other differences between Weiss and Sen are that Sen requires $F_{ni}$ to be twice differentiable and Weiss imposes slightly more complicated bounds on the various density functions. Since neither Weiss nor Sen requires the percentiles $\xi_n$ to converge, Assumption P6 is clearly stronger than the assumptions of Weiss and of Sen. However, P1-P7 avoid their requirement that each $F_{ni}$ have a density. The results of Section 3 apply whenever the average distributions $F_n$ are differentiable in a neighborhood of $\xi_n$.
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